The goal of this notebook is to compute a level of masculinity for each individual and each feature: \(p(i,j)\) denotes the probability that individual \(i\) demonstrates masculinity for feature \(j\).

The E-step of the EM algorithm computes these conditional probabilities. The first step is to extract them and create a new table of ids and probabilities.
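
As a rough illustration of what such an E-step probability looks like, here is a minimal sketch for a single feature. It assumes a two-component Gaussian mixture (one component per sex), which this notebook does not confirm. The function `e_step`, its parameters, and the `toy` data are hypothetical names introduced only for this example.

```r
# Hypothetical E-step for ONE feature under an assumed two-component
# Gaussian mixture. Parameter names are illustrative, not the notebook's own.
e_step <- function(x, p_men, mu_men, sd_men, mu_women, sd_women) {
  # Weighted density of each component at the observed values
  w_men   <- p_men * dnorm(x, mean = mu_men, sd = sd_men)
  w_women <- (1 - p_men) * dnorm(x, mean = mu_women, sd = sd_women)
  # Posterior probability of the "men" component
  w_men / (w_men + w_women)
}

# Toy data: one row per individual id, one made-up feature
toy <- data.frame(id = 1:6, feature_a = c(-1.2, -0.8, -0.1, 0.2, 0.9, 1.4))

# Table of id and probabilities, as described in the text
toy_probs <- data.frame(
  id = toy$id,
  p_masculine = e_step(toy$feature_a, p_men = 0.5,
                       mu_men = 1, sd_men = 1,
                       mu_women = -1, sd_women = 1)
)
```

With equal mixing weights and equal standard deviations, the probability in this toy setup rises monotonically with the feature value and equals 0.5 at the midpoint between the two means.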

Residual data

Full data

First, select only the FDR-significant features from the antidiagonal:
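
The selection step might look roughly like the sketch below, which applies the Benjamini-Hochberg correction through base R's `p.adjust`. The list `toy_results`, the field `p_value`, and the 0.05 threshold are assumptions made for illustration; the notebook's own result objects and its antidiagonal filter are not shown here and are not reproduced.

```r
# Hypothetical per-feature results; the real objects in this notebook
# carry more fields (e.g. hypothesis_results$cohen_d_test).
toy_results <- list(
  feat_a = list(hypothesis_results = list(p_value = 0.001)),
  feat_b = list(hypothesis_results = list(p_value = 0.020)),
  feat_c = list(hypothesis_results = list(p_value = 0.300)),
  feat_d = list(hypothesis_results = list(p_value = 0.040))
)

# Collect the raw p-values into a named numeric vector
p_raw <- vapply(toy_results,
                function(x) x$hypothesis_results$p_value,
                numeric(1))

# Benjamini-Hochberg adjustment controls the false discovery rate
p_fdr <- p.adjust(p_raw, method = "BH")

# Keep only the features whose adjusted p-value is below the assumed threshold
toy_significant <- toy_results[p_fdr < 0.05]
```

Note that `feat_d` has a raw p-value below 0.05 but does not survive the adjustment, which is the point of filtering on the adjusted values.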

Extract the feature data:

Join the tables:
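
The extract and join steps could be sketched as follows, assuming each feature yields a small table with an `id` column and one probability column. The names `toy_features` and `prob`, and the choice of an inner join on `id`, are assumptions for illustration only.

```r
# Hypothetical per-feature tables of id and probability
toy_features <- list(
  feat_a = data.frame(id = c(1, 2, 3), prob = c(0.1, 0.4, 0.9)),
  feat_b = data.frame(id = c(2, 3, 4), prob = c(0.7, 0.2, 0.5))
)

# Rename the probability column after its feature so columns stay distinct
toy_tables <- Map(function(df, nm) setNames(df, c("id", nm)),
                  toy_features, names(toy_features))

# Inner join on id: only individuals present in every table are kept
toy_joined <- Reduce(function(a, b) merge(a, b, by = "id"), toy_tables)
```

If individuals missing from some tables should be retained, `merge(a, b, by = "id", all = TRUE)` would give a full outer join with `NA` in the gaps.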

Cohen D histogram

hist(sapply(significant_features,
            function(x) x$hypothesis_results$cohen_d_test$estimate),
     xlab = "Cohen-D", main = "Histogram of Cohen D", sub = "Residual data")
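
For reference, the quantity plotted in the histogram is an effect size per feature. A minimal sketch of the standard pooled-standard-deviation form of Cohen's d is given below. The function `cohen_d` and the vectors `men` and `women` are hypothetical, and the notebook's `cohen_d_test` object may come from a different implementation or variant.

```r
# Cohen's d with a pooled standard deviation (illustrative helper,
# not necessarily the estimator used elsewhere in this notebook)
cohen_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Toy groups: means 4 and 3, both with variance 4, so pooled SD is 2
men   <- c(2, 4, 6)
women <- c(1, 3, 5)
d <- cohen_d(men, women)  # (4 - 3) / 2 = 0.5
```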

Correlation matrix

corrplot(combined_correaltion, order="alphabet", title="both genders - full data",
         tl.cex = 0.1,tl.srt = 45)

corrplot(combined_correaltion[FA_features,FA_features], method="color", title="FA",
         tl.cex = 0.1,tl.srt = 45)  

corrplot(combined_correaltion[MD_features,MD_features], method="color", title="MD",
         tl.cex = 0.1,tl.srt = 45)  

corrplot(combined_correaltion[Volume_features,Volume_features], method="color", title="Volume",
         tl.cex = 0.1,tl.srt = 45)  
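
The matrix and the feature-group vectors passed to `corrplot` above are built elsewhere in the notebook. A minimal sketch of how such inputs could be produced is shown below, assuming the feature groups can be identified by a column-name prefix. The data frame `toy_data`, the prefix convention, and all `toy_` names are assumptions for illustration.

```r
set.seed(42)

# Hypothetical joined table of per-feature values, one column per feature
toy_data <- data.frame(
  FA_left      = rnorm(50),
  FA_right     = rnorm(50),
  MD_left      = rnorm(50),
  Volume_total = rnorm(50)
)

# Pearson correlation between features, tolerating missing values
toy_correlation <- cor(toy_data, use = "pairwise.complete.obs")

# Pick one feature group by name prefix and take the matching sub-block
toy_FA_features <- grep("^FA_", colnames(toy_correlation), value = TRUE)
toy_FA_block <- toy_correlation[toy_FA_features, toy_FA_features]
```

Indexing by character vectors on both rows and columns keeps the sub-block square and labelled, which is what the per-group plots rely on.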

Anti diagonal

First, select only the FDR-significant features from the antidiagonal:

Extract the feature data:

Join the tables:

Cohen D histogram

hist(sapply(significant_features,
            function(x) x$hypothesis_results$cohen_d_test$estimate),
     xlab = "Cohen-D", main = "Histogram of Cohen D", sub = "Residual data")

Correlation matrix

corrplot(combined_correaltion, order="alphabet", title="both genders",
         tl.cex = 0.1,tl.srt = 45)

corrplot(combined_correaltion[FA_features,FA_features], method="color", title="FA",
         tl.cex = 0.1,tl.srt = 45)  

corrplot(combined_correaltion[MD_features,MD_features], method="color", title="MD",
         tl.cex = 0.1,tl.srt = 45)  

corrplot(combined_correaltion[Volume_features,Volume_features], method="color", title="Volume",
         tl.cex = 0.1,tl.srt = 45)  

Standardize data

Anti diagonal

First, select only the FDR-significant features from the antidiagonal:

Extract the feature data:

Join the tables:

Cohen D histogram

hist(sapply(significant_features,
            function(x) x$hypothesis_results$cohen_d_test$estimate),
     xlab = "Cohen-D", main = "Histogram of Cohen D", sub = "Residual data")

Correlation matrix

corrplot(combined_correaltion, order="alphabet", title="both genders",
         tl.cex = 0.1,tl.srt = 45)

corrplot(combined_correaltion[FA_features,FA_features], method="color", title="FA",
         tl.cex = 0.1,tl.srt = 45)  

corrplot(combined_correaltion[MD_features,MD_features], method="color", title="MD",
         tl.cex = 0.1,tl.srt = 45)  

corrplot(combined_correaltion[Volume_features,Volume_features], method="color", title="Volume",
         tl.cex = 0.1,tl.srt = 45)